A Surrogate Variable-Based Data Mining Method Using CFS and RSM
Authors
Abstract
In many scientific and engineering fields, some data sets are uncontrollable and hard to handle because measuring a performance variable is often destructive or very expensive; the factors behind such data are known as noise factors. Although noise factors, which cannot be controlled for manufacturing or cost reasons, have emerged as a key problem in data mining (DM) and analysis, most DM methods do not address the robustness of their solutions, either by considering noise factors or by incorporating specific statistical inferences. To address this problem, this paper proposes an integrated approach, called the surrogate variable-based data mining method (SVDM), which performs dimensionality reduction by extracting the significant factors from the raw data sets using correlation-based feature selection (CFS). The proposed method then incorporates noise factors to make the analysis robust by applying the principle of the surrogate variable. In addition, the proposed method is particularly effective when 100% inspection and a destructive characteristic or response are involved. Finally, response surface methodology (RSM), a statistical tool for modeling and analyzing situations in which the response of interest is affected by several input factors, is used for further statistical analysis.
Key-Words: Data mining, Surrogate variable, Correlation-based feature selection (CFS), Response surface methodology (RSM)
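The two generic building blocks named in the abstract, CFS and RSM, can be sketched as follows. This is a minimal illustration and not the authors' SVDM implementation: the abstract gives no algorithmic detail, so the greedy forward search, the use of absolute Pearson correlation in the CFS merit, the full second-order model, the synthetic data, and all function names (`cfs_merit`, `cfs_forward`, `quadratic_design`, `fit_rsm`) are assumptions made here. The surrogate-variable step is not shown because the abstract does not describe it.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff), with absolute Pearson
    correlations (an assumption; other correlation measures are possible)."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y):
    """Greedy forward search: add the feature that raises the merit most,
    and stop as soon as no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected

def quadratic_design(X):
    """Design matrix of a full second-order model.
    Column order: 1, x_i, x_i^2, then x_i*x_j for i < j."""
    n, p = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(p)]
    cols += [X[:, i] ** 2 for i in range(p)]
    cols += [X[:, i] * X[:, j] for i in range(p) for j in range(i + 1, p)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Ordinary least-squares fit of the second-order response surface."""
    beta, *_ = np.linalg.lstsq(quadratic_design(X), y, rcond=None)
    return beta

# Synthetic data (hypothetical): six candidate factors, two of which matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (2.0 + 1.5 * X[:, 0] - 1.5 * X[:, 2]
     + 0.5 * X[:, 0] * X[:, 2] + 0.3 * X[:, 2] ** 2
     + rng.normal(scale=0.1, size=200))

selected = sorted(cfs_forward(X, y))   # dimensionality reduction step
beta = fit_rsm(X[:, selected], y)      # response surface on retained factors
print("selected factors:", selected)
print("RSM coefficients:", np.round(beta, 3))
```

Screening factors first keeps the second-order model small: with p retained factors it has 1 + 2p + p(p-1)/2 coefficients, so dropping irrelevant factors reduces the number of terms that must be estimated.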
Similar resources
A stratified sampling technique based on correlation feature selection method for heart disease risk prediction system
In medicine, data mining methods can be used by physicians to improve clinical diagnosis. This paper introduces a stratified approach named Correlation Feature Selection Stratified Sampling (CFS-SS). The method is applied to a heart disease risk prediction system for medical diagnosis. Using the proposed system, the attributes are grouped into homogeneous subgroups, bef...
Multivariate Estimation of Rock Mass Characteristics Respect to Depth Using ANFIS Based Subtractive Clustering- Khorramabad- Polezal Freeway Tunnels
A combination of the Adaptive Network-based Fuzzy Inference System (ANFIS) and subtractive clustering (SC) has been used to estimate the deformation modulus (Em) and rock mass strength (UCSm) as a function of measurement depth. To do this, the ANFIS based on subtractive clustering (ANFISBSC) was first trained on 125 measurements of 9 variables such as rock mass strength (UCSm), deformati...
Data mining software using fuzzy inference systems at the World Wide Web
In this paper we present a software tool called Fuzzy_Query.Web (FQW), which is a web-based software interface that allows users to extract knowledge from a database using Classification Fuzzy Systems (CFS) and Database Manager System (DBMS). The FQW tool is used as an interpreter for several system structures that use CFS and various techniques of Artificial Intelligence (AI). Indexes for anal...
Experiences of Commissioning Mothers in Selection of Surrogate Mother
Background: The practice of surrogacy is one of the most controversial procedures in infertility treatment. Despite the increasing use of this technology in Iran, there are few practical data about surrogacy. No study has assessed the experiences of commissioning mothers in selecting a surrogate mother. Aim: The purpose of this study was to explore commissioning mothers' experiences in se...
Diagnosis of diabetes by using a data mining method based on native data
Background & Aim: Detecting the abnormal performance of diabetes and subsequently getting proper treatment can reduce the mortality associated with the disease. Also, failure to diagnose the disease in time can result in irreversible complications for the patient. The aim of this study was to determine the status of diabetes mellitus using data mining techniques. Methods: This is an analytical study and its databas...